AITopics | average improvement

Collaborating Authors

average improvement

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Cross-lingual Retrieval for Iterative Self-Supervised Training

Neural Information Processing SystemsDec-23-2025, 19:26:05 GMT

Recent studies have demonstrated the cross-lingual alignment ability of multilingual pretrained language models. In this work, we found that the cross-lingual alignment can be further improved by training seq2seq models on sentence pairs mined using their own encoder outputs. We utilized these findings to develop a new approach --- cross-lingual retrieval for iterative self-supervised training (CRISS), where mining and training processes are applied iteratively, improving cross-lingual alignment and translation ability at the same time. Using this method, we achieved state-of-the-art unsupervised machine translation results on 9 language directions with an average improvement of 2.4 BLEU, and on the Tatoeba sentence retrieval task in the XTREME benchmark on 16 languages with an average improvement of 21.5% in absolute accuracy. Furthermore, CRISS also brings an additional 1.8 BLEU improvement on average compared to mBART, when finetuned on supervised machine translation downstream tasks.

cross-lingual retrieval, iterative self-supervised training, name change, (3 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.85)

Add feedback

Skillful joint probabilistic weather forecasting from marginals

Alet, Ferran, Price, Ilan, El-Kadi, Andrew, Masters, Dominic, Markou, Stratis, Andersson, Tom R., Stott, Jacklynn, Lam, Remi, Willson, Matthew, Sanchez-Gonzalez, Alvaro, Battaglia, Peter

arXiv.org Artificial IntelligenceJun-13-2025

Machine learning (ML)-based weather models have rapidly risen to prominence due to their greater accuracy and speed than traditional forecasts based on numerical weather prediction (NWP), recently outperforming traditional ensembles in global probabilistic weather forecasting. This paper presents FGN, a simple, scalable and flexible modeling approach which significantly outperforms the current state-of-the-art models. FGN generates ensembles via learned model-perturbations with an ensemble of appropriately constrained models. It is trained directly to minimize the continuous rank probability score (CRPS) of per-location forecasts. It produces state-of-the-art ensemble forecasts as measured by a range of deterministic and probabilistic metrics, makes skillful ensemble tropical cyclone track predictions, and captures joint spatial structure despite being trained only on marginals.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

arXiv.org Artificial Intelligence

2506.10772

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Enhancing Disinformation Detection with Explainable AI and Named Entity Replacement

González-Silot, Santiago, Montoro-Montarroso, Andrés, Cámara, Eugenio Martínez, Gómez-Romero, Juan

arXiv.org Artificial IntelligenceFeb-7-2025

The automatic detection of disinformation presents a significant challenge in the field of natural language processing. This task addresses a multifaceted societal and communication issue, which needs approaches that extend beyond the identification of general linguistic patterns through data-driven algorithms. In this research work, we hypothesise that text classification methods are not able to capture the nuances of disinformation and they often ground their decision in superfluous features. Hence, we apply a post-hoc explainability method (SHAP, SHapley Additive exPlanations) to identify spurious elements with high impact on the classification models. Our findings show that non-informative elements (e.g., URLs and emoticons) should be removed and named entities (e.g., Rwanda) should be pseudo-anonymized before training to avoid models' bias and increase their generalization capabilities. We evaluate this methodology with internal dataset and external dataset before and after applying extended data preprocessing and named entity replacement. The results show that our proposal enhances on average the performance of a disinformation classification method with external test data in 65.78% without a significant decrease of the internal test performance.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.04863

Country:

North America > United States > California > San Francisco County > San Francisco (0.28)
Africa > Rwanda (0.24)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.32)
Health & Medicine > Therapeutic Area > Immunology (0.32)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.92)

Add feedback

Enhancing Financial Fraud Detection with Human-in-the-Loop Feedback and Feedback Propagation

Kadam, Prashank

arXiv.org Artificial IntelligenceNov-7-2024

Human-in-the-loop (HITL) feedback mechanisms can significantly enhance machine learning models, particularly in financial fraud detection, where fraud patterns change rapidly, and fraudulent nodes are sparse. Even small amounts of feedback from Subject Matter Experts (SMEs) can notably boost model performance. This paper examines the impact of HITL feedback on both traditional and advanced techniques using proprietary and publicly available datasets. Our results show that HITL feedback improves model accuracy, with graph-based techniques benefiting the most. We also introduce a novel feedback propagation method that extends feedback across the dataset, further enhancing detection accuracy. By leveraging human expertise, this approach addresses challenges related to evolving fraud patterns, data sparsity, and model interpretability, ultimately improving model robustness and streamlining the annotation process.

data mining, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2411.05859

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
(3 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology (1.00)
Banking & Finance (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

Cross-lingual Retrieval for Iterative Self-Supervised Training

Neural Information Processing SystemsOct-9-2024, 15:14:25 GMT

average improvement, cross-lingual retrieval, iterative self-supervised training

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.91)

Add feedback

Clover-2: Accurate Inference for Regressive Lightweight Speculative Decoding

Xiao, Bin, Gui, Lujun, Su, Lei, Chen, Weipeng

arXiv.org Artificial IntelligenceJul-31-2024

Large Language Models (LLMs) frequently suffer from inefficiencies, largely attributable to the discord between the requirements of auto-regressive decoding and the architecture of contemporary GPUs. Recently, regressive lightweight speculative decoding has garnered attention for its notable efficiency improvements in text generation tasks. This approach utilizes a lightweight regressive draft model, like a Recurrent Neural Network (RNN) or a single transformer decoder layer, leveraging sequential information to iteratively predict potential tokens. Specifically, RNN draft models are computationally economical but tend to deliver lower accuracy, while attention decoder layer models exhibit the opposite traits. This paper presents Clover-2, an advanced iteration of Clover, an RNN-based draft model designed to achieve comparable accuracy to that of attention decoder layer models while maintaining minimal computational overhead. Clover-2 enhances the model architecture and incorporates knowledge distillation to increase Clover's accuracy and improve overall efficiency. We conducted experiments using the open-source Vicuna 7B and LLaMA3-Instruct 8B models. The results demonstrate that Clover-2 surpasses existing methods across various model architectures, showcasing its efficacy and robustness.

attention decoder, clover-2, draft model, (14 more...)

arXiv.org Artificial Intelligence

2408.00264

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Augmentations vs Algorithms: What Works in Self-Supervised Learning

Morningstar, Warren, Bijamov, Alex, Duvarney, Chris, Friedman, Luke, Kalibhat, Neha, Liu, Luyang, Mansfield, Philip, Rojas-Gomez, Renan, Singhal, Karan, Green, Bradley, Prakash, Sushant

arXiv.org Artificial IntelligenceMar-8-2024

We study the relative effects of data augmentations, pretraining algorithms, and model architectures in Self-Supervised Learning (SSL). While the recent literature in this space leaves the impression that the pretraining algorithm is of critical importance to performance, understanding its effect is complicated by the difficulty in making objective and direct comparisons between methods. We propose a new framework which unifies many seemingly disparate SSL methods into a single shared template. Using this framework, we identify aspects in which methods differ and observe that in addition to changing the pretraining algorithm, many works also use new data augmentations or more powerful model architectures. We compare several popular SSL methods using our framework and find that many algorithmic additions, such as prediction networks or new losses, have a minor impact on downstream task performance (often less than $1\%$), while enhanced augmentation techniques offer more significant performance improvements ($2-4\%$). Our findings challenge the premise that SSL is being driven primarily by algorithmic improvements, and suggest instead a bitter lesson for SSL: that augmentation diversity and data / model scale are more critical contributors to recent advances in self-supervised learning.

architecture, augmentation, momentum encoder, (16 more...)

arXiv.org Artificial Intelligence

2403.05726

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > Illinois (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Multi-Resolution Diffusion for Privacy-Sensitive Recommender Systems

Lilienthal, Derek, Mello, Paul, Eirinaki, Magdalini, Tiomkin, Stas

arXiv.org Artificial IntelligenceNov-20-2023

While recommender systems have become an integral component of the Web experience, their heavy reliance on user data raises privacy and security concerns. Substituting user data with synthetic data can address these concerns, but accurately replicating these real-world datasets has been a notoriously challenging problem. Recent advancements in generative AI have demonstrated the impressive capabilities of diffusion models in generating realistic data across various domains. In this work we introduce a Score-based Diffusion Recommendation Module (SDRM), which captures the intricate patterns of real-world datasets required for training highly accurate recommender systems. SDRM allows for the generation of synthetic data that can replace existing datasets to preserve user privacy, or augment existing datasets to address excessive data sparsity. Our method outperforms competing baselines such as generative adversarial networks, variational autoencoders, and recently proposed diffusion models in synthesizing various datasets to replace or augment the original data by an average improvement of 4.30% in Recall@$k$ and 4.65% in NDCG@$k$.

dataset, ndcg, sdrm, (13 more...)

arXiv.org Artificial Intelligence

2311.03488

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > Santa Clara County > San Jose (0.04)
North America > United States > Alaska > Anchorage Municipality > Anchorage (0.04)
(15 more...)

Genre:

Research Report (1.00)
Overview (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Predicting the First Response Latency of Maintainers and Contributors in Pull Requests

Khatoonabadi, SayedHassan, Abdellatif, Ahmad, Costa, Diego Elias, Shihab, Emad

arXiv.org Artificial IntelligenceNov-13-2023

The success of a Pull Request (PR) depends on the responsiveness of the maintainers and the contributor during the review process. Being aware of the expected waiting times can lead to better interactions and managed expectations for both the maintainers and the contributor. In this paper, we propose a machine-learning approach to predict the first response latency of the maintainers following the submission of a PR, and the first response latency of the contributor after receiving the first response from the maintainers. We curate a dataset of 20 large and popular open-source projects on GitHub and extract 21 features to characterize projects, contributors, PRs, and review processes. Using these features, we then evaluate seven types of classifiers to identify the best-performing models. We also perform permutation feature importance and SHAP analyses to understand the importance and impact of different features on the predicted response latencies. Our best-performing models achieve an average improvement of 33% in AUC-ROC and 58% in AUC-PR for maintainers, as well as 42% in AUC-ROC and 95% in AUC-PR for contributors compared to a no-skilled classifier across the projects. Our findings indicate that PRs submitted earlier in the week, containing an average or slightly above-average number of commits, and with concise descriptions are more likely to receive faster first responses from the maintainers. Similarly, PRs with a lower first response latency from maintainers, that received the first response of maintainers earlier in the week, and containing an average or slightly above-average number of commits tend to receive faster first responses from the contributors. Additionally, contributors with a higher acceptance rate and a history of timely responses in the project are likely to both obtain and provide faster first responses.

contributor, maintainer, response latency, (15 more...)

arXiv.org Artificial Intelligence

2311.07786

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Toward smart production: Machine intelligence in business operations

#artificialintelligenceJun-3-2022, 12:25:08 GMT

As the superintendent of Vistra Corp's Luminant Martin Lake Power Plant, Wayne Brown is an expert in power generation. Vistra is the largest competitive power producer in the United States, operating power plants in 12 states and producing more than 39 gigawatts of electricity--enough to power nearly 20 million homes. The company has been on a journey to drive operational excellence across its generation portfolio. Launched in 2016, its Operational Performance Initiative has driven a step-change improvement in the efficiency of its assets, generating hundreds of millions in incremental EBITDA along the way. This article is a collaborative effort by Duane S. Boning, Vijay D'Silva, Pete Kimball, Bruce Lawler, Retsef Levi, and Ingrid Millan, representing views from McKinsey's Operations Practice and the Massachusetts Institute of Technology's Machine Intelligence for Manufacturing and Operations program. To maintain and improve its position, Vistra is continually looking for the tools, technologies, and approaches that will help it achieve the next level of performance. Most recently, the company has turned to digital and analytics, including machine intelligence (MI). It measures how much electricity is generated for each ton of fuel consumed by the plant.

enabler, intelligence, use case, (15 more...)

#artificialintelligence

Country: North America > United States > Massachusetts (0.25)

Genre: Research Report > New Finding (0.69)

Industry: Energy > Power Industry (0.74)

Technology: Information Technology > Artificial Intelligence > Applied AI (1.00)

Add feedback